Structural Equation Modeling in HCI Research using SEMinR

Part 2 - Constructs and Estimation

André Calero Valdez

University of Lübeck

Lilian Kojan

University of Lübeck

Nicholas Danks

Trinity College Dublin

Soumya Ray

National Tsing Hua University

Background

Why structural equation modeling?

As researchers in HCI or in the social sciences, we study human properties (operationalized as variables) and how they are related.

  • Human variables: Often impossible to measure directly.
  • How: Ideally, what causes changes in a certain variable?

Measurement

From properties to variables

… using survey items: Structural equation modeling allows us to examine the relationship between items (the indicators) and variables (the constructs).

Causal relationships?

If we research relationships between variables, we are often interested in causality. Structural equation modeling uses regression to model relationships between variables (or constructs). But regression does not imply causality.

The gold standard for examining causality in science is the randomized controlled trial. But in HCI research, we often only have observational data.

From association …

… to causation?

Causal inference and observational data?

But not all is lost.

Causal interpretation is possible if

  • there is strong support of that interpretation by theory and previous evidence
  • the model design rules out alternative interpretations, using causal modeling techniques such as Directed Acyclic Graphs (DAGs) or Structural Equation Modeling

Further reading: The Causal Foundations of Structural Equation Modeling. [2]

Benefits of SEM

Structural equation modeling then allows us to examine two relationships at the same time:

  • The relationship between measurements and variables, or in SEM terminology, between indicators and constructs.
  • And the relationship between different constructs.

These relationships are defined in two models that are linked in the estimation process:

  • The measurement model and the structural model.

How to do SEM

Process of SEM estimation

We now take you through the process of SEM estimation as you would go through it in your own research.

  1. Define your research question
  2. Define structural model
  3. Define measurement model
  4. Gather data
  5. Prepare data
  6. Estimate model
  7. Bootstrap model

Define RQ

  • You start your process by thinking about your research questions.
  • Imagine you want to know what causes adoption of your software.

We have models to predict behavioral intention (BI).

  • For example: UTAUT2

UTAUT2

Unified Theory of Acceptance and Use of Technology [3]

UTAUT2 and Trello

We are interested in examining UTAUT2 for Trello. Specifically, our RQ is:

  • What determines BI for using Trello?

UTAUT2 has a strong theory and evidence base. This gives us some support for causal interpretation.

Structural models

Build a structural model

From your research questions, you will derive some hypotheses.

We hypothesize that Performance Expectancy (PE), Effort Expectancy (EE), and Social Influence (SI) influence BI.

Build a structural model

So far this is equivalent to a regular multiple regression.
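Written out, the hypothesis above corresponds to a single regression equation. The coefficient symbols \(\beta_1, \beta_2, \beta_3\) and the error term \(\varepsilon\) are notation assumed here for illustration, and the equation assumes standardized construct scores, so there is no intercept:

\[
BI = \beta_1 \, PE + \beta_2 \, EE + \beta_3 \, SI + \varepsilon
\]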

But we might also hypothesize that

  • the number of navigation errors influences EE
  • and that the number of successful tasks influences PE

These hypotheses can be directly translated into code. And this code can be visualized as a DAG.
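A minimal sketch of how these hypotheses might look in SEMinR, using `relationships()` and `paths()`. The construct names `NavErrors` and `TaskSuccess` are hypothetical labels chosen here for the two behavioral measures; the deck does not name them.

```r
library(seminr)

# Each paths() call lists the predictors (from) and the outcome (to).
structural_model <- relationships(
  paths(from = c("PE", "EE", "SI"), to = "BI"),  # core hypotheses
  paths(from = "NavErrors", to = "EE"),          # hypothetical construct name
  paths(from = "TaskSuccess", to = "PE")         # hypothetical construct name
)

# One row per hypothesized path (source, target)
structural_model
```

Every row of the result is one directed edge, which is why the model can be drawn as a DAG. Recent SEMinR versions should be able to draw it with `plot(structural_model)`, assuming the plotting dependencies are installed.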

New model

Measurement Models

Build a measurement model

Now, we think about how we want to measure these constructs.

Luckily, for UTAUT2, there are some standardized items so that part is made easy for us.

Example - Behavioral Intention (Scale 1-7):

  • INT1: I intend to use Trello in the next month.
  • INT2: I predict I would use Trello in the next month.
  • INT3: I plan to use Trello in the next month.

Data Dictionary

Excursus: Measurement theory

There is a distinction between reflective and formative measurement.

  • Reflective means the scale is the result of a latent construct (e.g. openness).

  • Formative means the scale is the construct (e.g. IQ).

These measurement concepts can correspond to different mathematical estimation techniques.

Excursus: Measurement theory

Most PLS-based SEM analyses use formative scales.

Different software frameworks may use different terminology (e.g. SmartPLS, SEMinR)

SmartPLS                 SEMinR
reflective constructs    Composite Mode A (cor)
formative constructs     Composite Mode B (reg)
PLSc                     Reflective

Build a measurement model

For this exercise, we assume that all our constructs are type A composite constructs.
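As a sketch, the three SEMinR construct types from the terminology mapping can be declared as follows. The example uses the three behavioral intention items (INT1 to INT3) listed earlier; the object names are illustrative only.

```r
library(seminr)

# Item names are built from a prefix and item numbers.
items <- multi_items("INT", 1:3)

# Composite with Mode A (correlation) weights: "reflective" in SmartPLS terms
bi_mode_a <- composite("BI", items, weights = mode_A)

# Composite with Mode B (regression) weights: "formative" in SmartPLS terms
bi_mode_b <- composite("BI", items, weights = mode_B)

# Common-factor construct, estimated with consistent PLS (PLSc)
bi_factor <- reflective("BI", items)
```

Mode A is the default for `composite()`, so the type A composites assumed in this exercise do not need the `weights` argument spelled out.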

Measurement models

Items (manifest variables) are often named with the construct abbreviation followed by a number (e.g. PE1).

Constructs (latent variables) can have a single or multiple items.

The names should correspond to the column names in our data.
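A measurement model along these lines could be specified as sketched below. The number of items per construct and the column names `nav_errors` and `tasks_done` are assumptions made for illustration; adapt them to the columns in your own data. Only the INT1 to INT3 items for BI are taken from the example above.

```r
library(seminr)

measurement_model <- constructs(
  composite("PE", multi_items("PE", 1:4)),   # assumed: columns PE1..PE4
  composite("EE", multi_items("EE", 1:4)),   # assumed: columns EE1..EE4
  composite("SI", multi_items("SI", 1:3)),   # assumed: columns SI1..SI3
  composite("BI", multi_items("INT", 1:3)),  # columns INT1..INT3
  composite("NavErrors", single_item("nav_errors")),   # hypothetical column
  composite("TaskSuccess", single_item("tasks_done"))  # hypothetical column
)

# multi_items() only pastes prefix and numbers together
pe_items <- multi_items("PE", 1:4)
pe_items
```

Because `multi_items()` simply generates names such as `PE1`, a mismatch between these names and the data columns is a common source of errors at estimation time.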

The specified model

Data

Gather data

This model can be used for preregistration (e.g. using OSF).

Data collection is simplified using the prepared scales (survey, experiments).

Here, we collected data for you [1].

Prepare data

  • recode data (numerical data)
  • treat missing data (or SEMinR will do it for you!)
  • rename variables (naming scheme)
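The three preparation steps can be sketched in base R. The raw column names and the verbal response labels below are invented for illustration and are not taken from the study data.

```r
# Hypothetical raw export: verbal labels, arbitrary column names, one missing value
raw <- data.frame(
  intention_1 = c("agree", "strongly agree", NA),
  intention_2 = c("neutral", "agree", "agree"),
  stringsAsFactors = FALSE
)

# Assumed 7-point scale, ordered from 1 to 7
likert_levels <- c("strongly disagree", "disagree", "somewhat disagree",
                   "neutral", "somewhat agree", "agree", "strongly agree")

# 1. Recode labels to numbers via their position on the scale
prepared <- as.data.frame(lapply(raw, function(x) {
  as.integer(factor(x, levels = likert_levels))
}))

# 2. Rename columns to the scheme used in the measurement model
names(prepared) <- paste0("INT", seq_along(prepared))

# 3. Optional: replace missing values with the item mean
mean_replaced <- prepared
for (col in names(mean_replaced)) {
  is_missing <- is.na(mean_replaced[[col]])
  mean_replaced[[col]][is_missing] <- mean(mean_replaced[[col]], na.rm = TRUE)
}
```

Step 3 is shown only to make the idea explicit. Recent SEMinR versions expose a `missing` argument on `estimate_pls()` that defaults to mean replacement, so check your installed version before doing this by hand.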

Model Estimation

Estimate model

Two main questions:

  • Do our items measure sensible constructs?
  • Does our data support the hypothesized relationships?

Single function call in SEMinR

Estimation using SEMinR
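The following sketch shows the single call to `estimate_pls()`. To keep it self-contained it runs on synthetic data generated on the spot, not on the study data, and it covers only the three core predictors of BI. The effect sizes, the three items per construct, and the helper `make_items()` are all assumptions for illustration.

```r
library(seminr)

# Synthetic stand-in data (NOT the study data)
set.seed(42)
n <- 200
latent <- data.frame(PE = rnorm(n), EE = rnorm(n), SI = rnorm(n))
latent$BI <- 0.4 * latent$PE + 0.3 * latent$EE + 0.2 * latent$SI +
  rnorm(n, sd = 0.7)

# Hypothetical helper: k noisy 7-point items per latent score
make_items <- function(score, prefix, k = 3) {
  items <- sapply(seq_len(k), function(i) {
    noisy <- score + rnorm(length(score), sd = 0.5)
    pmin(7, pmax(1, round(4 + 1.5 * noisy)))
  })
  colnames(items) <- paste0(prefix, seq_len(k))
  items
}

survey <- as.data.frame(cbind(
  make_items(latent$PE, "PE"), make_items(latent$EE, "EE"),
  make_items(latent$SI, "SI"), make_items(latent$BI, "INT")
))

measurement_model <- constructs(
  composite("PE", multi_items("PE", 1:3)),
  composite("EE", multi_items("EE", 1:3)),
  composite("SI", multi_items("SI", 1:3)),
  composite("BI", multi_items("INT", 1:3))
)
structural_model <- relationships(
  paths(from = c("PE", "EE", "SI"), to = "BI")
)

# The single function call that estimates both models together
pls_model <- estimate_pls(
  data = survey,
  measurement_model = measurement_model,
  structural_model = structural_model
)

model_summary <- summary(pls_model)
model_summary$paths     # path coefficients and variance explained
model_summary$loadings  # item loadings per construct
```

The two summary tables map onto the two main questions: the loadings speak to whether the items measure sensible constructs, the paths to whether the data support the hypothesized relationships.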

Estimation Technique

SEMinR uses partial least squares (PLS) estimation

  • No need for normally distributed data
  • Smaller sample sizes produce usable results

Outputs

Measurement model:

  • Loadings as \(\lambda\) or weights as \(\omega\) values

Structural model:

  • Path coefficients as \(\beta\) values
  • Variance explained as \(R^2\)

Are these values “significant”?

Bootstrapping

Bootstrap a model

What is bootstrapping?

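  • Repeat the estimation process \(n\) times, each time on a sample of the same size drawn from the data with replacement
  • Determine 95% confidence intervals for parameters empirically from the resulting estimates
  • Also provides p-values via the t-distribution (this assumes normally distributed estimates)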
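The idea can be illustrated in base R with a single regression slope standing in for a path coefficient. This is a conceptual sketch on simulated data, not SEMinR's internal implementation; the degrees of freedom used for the p-value are an illustrative choice.

```r
set.seed(123)

# Simulated data: y depends on x with a true slope of 0.5
n <- 150
x <- rnorm(n)
dat <- data.frame(x = x, y = 0.5 * x + rnorm(n))

# The "estimation process": here simply the slope of y on x
estimate_slope <- function(d) unname(coef(lm(y ~ x, data = d))["x"])
original <- estimate_slope(dat)

# Repeat the estimation on resamples drawn with replacement
n_boot <- 1000
boot_slopes <- replicate(n_boot, {
  resampled <- dat[sample(nrow(dat), replace = TRUE), ]
  estimate_slope(resampled)
})

# Empirical 95% confidence interval from the bootstrap distribution
ci_95 <- quantile(boot_slopes, probs = c(0.025, 0.975))

# t-based p-value: estimate divided by its bootstrap standard error
t_value <- original / sd(boot_slopes)
p_value <- 2 * pt(-abs(t_value), df = n_boot - 1)  # df is an assumption
```

In SEMinR the resampling is handled by `bootstrap_model()`, which takes a model estimated with `estimate_pls()` along with arguments such as `nboot` and `seed`; calling `summary()` on the result reports the bootstrapped estimates.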

Bootstrap a model

References

[1] Danks, N.P. et al. 2023. The composite overfit analysis framework: Assessing the out-of-sample generalizability of construct-based models using predictive deviance, deviance trees, and unstable paths. Management Science. (2023).
[2] Pearl, J. 2012. The causal foundations of structural equation modeling. Handbook of Structural Equation Modeling. (2012), 68–91.
[3] Venkatesh, V. et al. 2012. Consumer acceptance and use of information technology: Extending the unified theory of acceptance and use of technology. MIS Quarterly. (2012), 157–178.